A MathLang Path into a Coq Proof Skeleton

Authors

  • Fairouz Kamareddine
  • Joe B. Wells
  • Christoph Zengler
Abstract

The computerization of mathematical texts is often a tedious and manual task. The MathLang System was developed to carry out this process in gradual steps. The idea is that the user can annotate an existing mathematical text with different types of information (grammatical/logical/rhetorical/etc.) and MathLang generates different computerised versions of this text which accommodate different levels of formality. So far there are paths from MathLang into skeletons for the Mizar and Isabelle proof checkers. In this paper we add new reasoning methods to MathLang’s Document Rhetorical aspect (DRa) and develop a new path from an annotated text into a proof skeleton for the Coq proof assistant. We test and evaluate our new approach with the help of the first chapter of Landau’s “Grundlagen der Analysis”.

1 Background and motivation

MathLang is intended to support different degrees of formalisation and aims to make easier the partial or full formalisation of mathematical texts in some foundation. Furthermore, for documents where full formalisation is a goal, MathLang is intended to allow this to be accomplished in gradual steps. Full formalisation is sometimes desirable, but also is often undesirable due to its expense and the requirement to commit to many inessential foundational details. Partial formalisation is sometimes desirable for various reasons: it can be helpful with automated checking, semantics-based searching and querying, and interfacing with computer algebra systems (and other mathematical computation environments).

In MathLang partial formalisation can be carried out to different degrees:

– The abstract syntax trees of symbolic formulas can be represented accurately. This is usually missing in systems like LaTeX or Presentation MathML, while more semantically oriented systems provide this to some degree. This can be used to provide editing support for algebraic rearrangements and simplifications, and can help with interfacing with computer algebra systems.
– The mathematical structure of natural language text can be represented in a way similar to how symbolic formulas are handled. Furthermore, mixed text and symbols can be handled. This can help in the same way as capturing the structure of symbolic formulas can help.
– A weak type system can be used to check simple grammatical conditions without checking full semantic sensibility.
– Justifications (inside proofs/between formal statements) can be linked (without always stating precisely how they are used). Uses of this feature include:
  • Extracting only those parts of a document that are relevant to specific results. (This could be useful in educational systems.)
  • Checking that instances of circular reasoning are handled via induction.
  • Calculating proof gaps as a first step toward fuller formalisation.
– If one commits to a foundation (or a family of foundations), one can start to use sophisticated type systems for checking more aspects of well-formedness.

The design of MathLang is (currently) divided into three aspects:

– The Core Grammatical aspect (CGa) [4,7] takes the best features of Weak Type Theory (WTT) [5] and MV [1] and enhances the nouns and adjectives of WTT with ideas from object-oriented programming so that nouns are more like classes and adjectives are more like mixins. In CGa, the different kinds of name-introducing forms of WTT are unified; all definitions by default have indefinite forward scope and a local scope operator is used to allow local definitions. The basic unit becomes the step, which can be either a definition, a statement (a phrase that asserts something), or a block which is merely a grouping of steps. We have nine different kinds of CGa annotations: term, set, noun, adjective, statement, declaration, definition, step, and context.
CGa provides a grammar for well-formed mathematics with grammatical categories and allows checking for basic well-formedness conditions (e.g., the origin of all names and symbols can be tracked).

Fig. 1. Example of CGa encoding of mathematician’s text: the annotated sentence “There is an element 0 in R such that a + 0 = a” is encoded as ∃( 0 : R, = ( + ( a, 0 ), a ) ).

– The Text and Symbol aspect (TSa) [6,7,2] allows integrating normal typesetting and authoring software with the mathematical structure represented with CGa. TSa allows weaving together usual mathematical authoring representations such as LaTeX, XML, or TeXmacs with CGa data. Thanks to a notion of souring rules (called “souring” because it does the opposite of syntactic sugaring), TSa allows the structure of the mathematical text to follow the structure of mathematics as conceived by the mathematician.
– The Document Rhetorical aspect (DRa) [3,8] supports identifying portions of a text and expressing the relationships between them. Any portion of text (e.g., a phrase, a step, a block, etc.) can be given an identity and relationships can be expressed between identified pieces of text. For example, a chunk of text can be identified as a “theorem”, and another as the “proof” of that theorem. Similarly, one chunk of text can be a “subsection” or “chapter” of another. This way, it is possible to do computations to check whether i) all dependencies are identified, ii) the relationships are sensible/problematic (hence whether the author should be warned), and to extract/explain the logical structure of a text. Such dependencies have been used in generating formal proof sketches and identifying the proof holes that remain to be filled.

In addition to the design of MathLang itself, we have worked on relating a MathLang text to a fully formalised version of the text.
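To make the CGa encoding of Fig. 1 concrete, the following is a minimal sketch of how the expression ∃( 0 : R, = ( + ( a, 0 ), a ) ) could be held as an abstract syntax tree, and how the origin of names could be tracked. The class names `Var`, `App` and `Binder` and the functions `render` and `free_names` are hypothetical and chosen for illustration only; they are not part of MathLang.

```python
from dataclasses import dataclass
from typing import Set, Tuple, Union


@dataclass(frozen=True)
class Var:
    """A name occurring in a formula, e.g. a or 0 (hypothetical class)."""
    name: str


@dataclass(frozen=True)
class App:
    """A symbol applied to arguments, e.g. +(a, 0) (hypothetical class)."""
    head: str
    args: Tuple["Expr", ...]


@dataclass(frozen=True)
class Binder:
    """A binding form that introduces `var` ranging over `domain`."""
    symbol: str
    var: str
    domain: str
    body: "Expr"


Expr = Union[Var, App, Binder]


def render(e: Expr) -> str:
    """Print the tree in a prefix notation close to the one used in Fig. 1."""
    if isinstance(e, Var):
        return e.name
    if isinstance(e, App):
        return f"{e.head}( {', '.join(render(a) for a in e.args)} )"
    return f"{e.symbol}( {e.var} : {e.domain}, {render(e.body)} )"


def free_names(e: Expr) -> Set[str]:
    """Names used in the tree that no enclosing binder introduces."""
    if isinstance(e, Var):
        return {e.name}
    if isinstance(e, App):
        return set().union(*(free_names(a) for a in e.args))
    return free_names(e.body) - {e.var}


# "There is an element 0 in R such that a + 0 = a"
fig1 = Binder("∃", "0", "R",
              App("=", (App("+", (Var("a"), Var("0"))), Var("a"))))
```

Here `free_names(fig1)` contains only `a`: the name 0 is introduced by the binder, whereas `a` must have been introduced somewhere earlier in the text. This is the kind of basic well-formedness condition the text attributes to CGa, although the real CGa check is defined over its own grammatical categories rather than over this simplified tree.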
Using a CGa and DRa annotated text, we have given in [3] a procedure for producing a corresponding Mizar document, first as a proof sketch with holes and then as a fully completed proof. We have also worked on doing this with Isabelle [6]. Figure 2 diagrams the paths in MathLang.

Fig. 2. Overall situation of work in MathLang

In this paper, we make the following progress:

– Extending the formalisation and implementation of the DRa.
– Completing the path in MathLang in order to reach full formalisation.
– Introducing a third theorem prover (Coq) as a test bed for MathLang.
– Developing the Mizar proof skeleton of [3] into an automatically generated proof skeleton in a choice of theorem provers (Mizar, Isar and Coq, etc.). To achieve this, we give a generic algorithm for automatic proof skeleton generation which takes a DRa tree and the required prover as arguments.
– Giving hints for developing a generic algorithm to automatically convert parts of a CGa annotated text into the syntax of the prover in question.
– Giving an extensive example of how the mathematician’s text passes through all the stages of MathLang from the original into the fully formalised text.

2 Extended Formalisation and Implementation of DRa

The DRa structure of a text can be represented as a tree (which is exactly the tree of the XML representation of the DRa annotated MathLang document). Due to this tree structure, we refer to an annotated part of a text as a DRa node (e.g., see figure 11). The role of this node is declaration and its name is decA. Note that the content of a DRa node is the user’s CGa and TSa annotation. In the DRa annotation of a document, there is a dedicated root node (the Document node) where each top-level DRa node is a child of this root node. For example in figure 4, the tree has 10 nodes. The root node (labelled Document) has four children nodes and five grandchildren nodes (all children of B).
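The tree described above can be sketched as a small data structure. This is an illustrative assumption, not MathLang's implementation: the class `DRaNode` and the helpers `count_nodes` and `grandchildren` are hypothetical, and the node labels A to I with their roles follow the description of figures 3 and 4 in this section.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DRaNode:
    """One annotated part of the text: a name, a role, and child nodes."""
    name: str
    role: str
    children: List["DRaNode"] = field(default_factory=list)


def count_nodes(node: DRaNode) -> int:
    """Count the node itself plus all of its descendants."""
    return 1 + sum(count_nodes(child) for child in node.children)


def grandchildren(root: DRaNode) -> List[str]:
    """Names of all nodes exactly two levels below the root."""
    return [g.name for child in root.children for g in child.children]


def build_example() -> DRaNode:
    """The ten-node tree: Document, four children, five children of B."""
    proof_of_lemma = DRaNode("B", "proof", [
        DRaNode("E", "definition"),
        DRaNode("F", "claim"),
        DRaNode("G", "proof"),
        DRaNode("H", "case"),
        DRaNode("I", "case"),
    ])
    return DRaNode("Document", "document", [
        DRaNode("A", "lemma"),
        proof_of_lemma,
        DRaNode("C", "corollary"),
        DRaNode("D", "proof"),
    ])


document = build_example()
```

With this encoding, `count_nodes(document)` gives 10 and `grandchildren(document)` lists E, F, G, H and I, which matches the counts stated for figure 4. Because the DRa tree is said to coincide with the XML tree of the annotated document, a nested structure like this one is a natural in-memory form for it.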
We distinguish between proved nodes (theorem, lemma, etc.), which are drawn with a solid line in the picture, and unproved nodes (axiom, definition, etc.), which are drawn with a broken line. To check a DRa annotated document for validity, it is important to know whether a node is to be proved or not. For example, this information lets the check report an error if someone tries to prove an unproved node, e.g. a definition or an axiom. When document D2 references document D1, it can reference the root node of D1 to include all of its mathematical text.

In figure 3 (taken from [3]), there are four top-level nodes: A, B, C and D, representing respectively lemma 1, its proof, corollary 2, and its proof. The proof of lemma 1 has five children: E, F, G, H and I, representing respectively the definition of the predicate, a claim, the proof of the claim, and cases 1 and 2. The visual representation of this tree is on the left-hand side of figure 4.

Lemma 1 (node A). For $m, n \in \mathbb{N}$ one has: $m^2 = 2n^2 \implies m = n = 0$
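The validity condition on proved and unproved nodes described in this section can be sketched as a simple check over "justifies" links. The function `check_justifications` and the two role sets are hypothetical. Treating corollary and claim as proved roles is an assumption here, since the text only names theorem and lemma explicitly for proved nodes, and axiom and definition for unproved ones.

```python
from typing import Dict, List, Tuple

# Roles named in the text, plus "corollary" and "claim" (assumed proved).
PROVED_ROLES = {"theorem", "lemma", "corollary", "claim"}
UNPROVED_ROLES = {"axiom", "definition"}


def check_justifications(roles: Dict[str, str],
                         justifies: List[Tuple[str, str]]) -> List[str]:
    """Return one error message per link that tries to prove a node
    whose role does not admit a proof."""
    errors = []
    for proof, target in justifies:
        role = roles[target]
        if role in UNPROVED_ROLES:
            errors.append(f"{proof} attempts to prove {target}, "
                          f"but a {role} is an unproved node")
        elif role not in PROVED_ROLES:
            errors.append(f"{proof} attempts to prove {target}, "
                          f"whose role {role} is not known to be provable")
    return errors


# Roles of the nodes of figure 3 as described in the text.
roles = {
    "A": "lemma", "B": "proof", "C": "corollary", "D": "proof",
    "E": "definition", "F": "claim", "G": "proof",
    "H": "case", "I": "case",
}

# B proves lemma 1, D proves corollary 2, G proves the claim F.
justifies = [("B", "A"), ("D", "C"), ("G", "F")]

ok = check_justifications(roles, justifies)
bad = check_justifications(roles, justifies + [("G", "E")])
```

For the links taken from the description of figure 3, the check reports no errors. Adding a link from G to the definition E yields one message, which corresponds to the case the text mentions of someone trying to prove a definition or an axiom.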


Related resources

Computerising Mathematical Text with MathLang

Mathematical texts can be computerised in many ways that capture differing amounts of the mathematical meaning. At one end, there is document imaging, which captures the arrangement of black marks on paper, while at the other end there are proof assistants (e.g., Mizar, Isabelle, Coq, etc.), which capture the full mathematical meaning and have proofs expressed in a formal foundation of mathemat...


Gradual Computerisation/Formalisation of Mathematical Texts into Mizar

We explain in this paper the gradual computerisation process of an ordinary mathematical text into more formal versions ending with a fully formalised Mizar text. The process is part of the MathLang–Mizar project and is divided into a number of steps (called aspects). The first three aspects (CGa, TSa and DRa) are the same for any MathLang–TP project where TP is any proof checker (e.g., Mizar, ...



Computerizing Mathematical Text with MathLang

Mathematical texts can be computerized in many ways that capture differing amounts of the mathematical meaning. At one end, there is document imaging, which captures the arrangement of black marks on paper, while at the other end there are proof assistants (e.g., Mizar, Isabelle, Coq, etc.), which capture the full mathematical meaning and have proofs expressed in a formal foundation of mathemat...


Toward an Object-Oriented Structure for Mathematical Text

Computerizing mathematical texts to allow software access to some or all of the texts’ semantic content is a long and tedious process that currently requires much expertise. We believe it is useful to support computerization that adds some structural and semantic information, but does not require jumping directly from the word-processing level (e.g., LaTeX) to full formalization (e.g., Mizar, Co...



Publication date: 2011